IMS Collections 

Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor 
Pranab K. Sen 

Vol. 1 (2008) 350-363 

© Institute of Mathematical Statistics, 2008 
DOI: 10.1214/193940307000000266 



Estimating medical costs from a

Abstract: Nonparametric estimators of the mean total cost have been proposed
in a variety of settings. In clinical trials it is generally impractical to
follow up patients until all have responded, and therefore censoring of patient 
outcomes and total cost will occur in practice. We describe a general longi- 
tudinal framework in which costs emanate from two streams, during sojourn 
in health states and in transition from one health state to another. We con- 
sider estimation of net present value for expenditures incurred over a finite 
time horizon from medical cost data that might be incompletely ascertained 
in some patients. Because patient-specific demographic and clinical characteristics
would influence total cost, we use a regression model to incorporate
covariates. We discuss similarities and differences between our net present value
estimator and other widely used estimators of total medical costs. Our model 
can accommodate heteroscedasticity, skewness and censoring in cost data and 
provides a flexible approach to analyses of health care cost. 



1. Introduction 

Estimating cost from medical follow-up studies has been the focus of extensive 
methodological research. Cost data in observational studies exhibit several features 
such as heteroscedasticity, skewness and censoring that must be addressed in sta- 
tistical modeling so that ensuing inference would be valid. In clinical trials it is 
generally impractical to prolong a study until all patients have responded, and 
therefore inevitably censoring of patient outcomes and total cost will occur in practice.
Since costs are incurred over time, the cumulative cost $C(t)$ at time $t$ is a
nonnegative monotone function. Cost accumulation ends at an event time $T$, for
example at death for lifetime cost, or at a specified finite time horizon $\tau$. Interest
lies in estimating the mean cost $\mu = E(C(T^*))$ where $T^* = \min(T, \tau)$. Because $T$
could be precluded from observation by censoring at time $U$, that is, when $T > U$,
the corresponding cost would be complete only if $U > T^*$. Several nonparametric
estimators of $\mu$ have been proposed in a variety of settings, with regression models
being the mainstay for assessing the influence of patient-specific characteristics
(e.g., treatments, demographics, comorbidity) on cost (for example, Bang and Tsiatis
[2, 3], Baser et al. [4, 5], Lin [13, 14], Lin et al. [15], O'Hagan and Stevens
[17], Strawderman [18] and Gardiner et al. [8]).

This article adopts a broader view of the cumulative cost $\{C(t) : 0 \le t \le \tau\}$
within the framework of a longitudinal model. Section 2 describes all the substantive
aspects of our models starting with an underlying finite state stochastic process
for the evolution of patient events as they occur over time. The states are different
health conditions that the patient presents over the period $[0, \tau]$. Costs emanate
from two streams, during sojourn in health states and in transition from one health
state to another. We consider estimation of net present value (NPV) for expenditures
incurred over $[0, \tau]$. Regression models for the event history process and for
observed costs are used to incorporate covariates. Section 3 outlines the method
of estimation of NPV from a patient sample of time-censored event history data. 
We then discuss similarities and differences between our net present value estima- 
tor and other widely used estimators of total medical costs. Section 4 is a brief 
summary and conclusion. 

2. Stochastic model 

2.1. Transition and sojourn cost 

A stochastic process $X = \{X(t) : t \in \mathcal{T}\}$ on the interval $\mathcal{T} = [0, \tau]$, where $\tau < \infty$,
describes the health states of a patient from the relevant population under study.
The time $\tau$ is the maximum limit of observation for all cost and patient outcomes.
The state space of $X$ is finite and labeled $E = \{0, \ldots, m\}$ and consists of several
transient states, such as "well", "recovery", "relapse", and one or more absorbing
states such as "dead" or "disabled". A transient state is one which if visited will
be exited after a finite sojourn, whereas a transition out of an absorbing state is
impossible. Costs are incurred while sojourning in a transient health state and in
transition between states. If the patient is in state $h$ at time $t$, that is, $X(t) = h$,
the expenditure rate is $B(t, h)$. If a transition occurs from state $h$ to state $j$ at time
$t$, that is, $X(t-) = h$ and $X(t) = j$, a cost $C(t, h, j)$ is incurred.

The notation $[A]$ denotes the indicator function of the event $A$, taking value 1
if $A$ is true and 0 if $A$ is false. For example, to indicate the state of occupation
just prior to time $t$ we write $Y_h(t) = [X(t-) = h]$. The number of direct transitions
$h \to j$, $h \ne j$, in the time interval $[0, t]$ is $N_{hj}(t) = \#\{s \le t : X(s-) = h, X(s) = j\}$.
If $r$ is the discount rate, the present value of expenditures associated with all $h \to j$
transitions in $\mathcal{T}$ is



(1)    $C_{hj}^{(1)} = \int_0^\tau e^{-rt}\, C(t, h, j)\, dN_{hj}(t),$




and the present value of expenditures for all sojourns in state h in T is 



(2)    $C_h^{(2)} = \int_0^\tau e^{-rt}\, B(t, h)\, Y_h(t)\, dt.$




We will interpret all integrals as on the semi-open interval $(0, \tau]$. In practice we
want to estimate the expected values (averages) of these two quantities. To do so
we impose a Markov model on $X$ to govern the transitions between states.

2.2. Markov model

We call X a non-homogeneous Markov process if 

    $P[X(t) = j \mid X(s) = h,\ X(u) : u < s] = P[X(t) = j \mid X(s) = h]$

for all $h, j \in E$ and all $s \le t$. The transition probabilities $P_{hj}(s, t)$, $s \le t$, of $X$ are
given by $P_{hj}(s, t) = P[X(t) = j \mid X(s) = h]$ and the transition intensities $\alpha_{hj}(t)$ by
$\alpha_{hj}(t) = \lim_{\Delta t \downarrow 0} P[X(t + \Delta t) = j \mid X(t) = h]/\Delta t$, $j \ne h$, with $\alpha_{hh} = -\sum_{j \ne h} \alpha_{hj}$.
Throughout we assume that the $\alpha_{hj}$ are integrable on $\mathcal{T} = [0, \tau]$. The $(m+1) \times (m+1)$
matrices $P = \{P_{hj}(s, t)\}$ and $\alpha = \{\alpha_{hj}\}$ are related by the product-integral formula
$P(s, t) = \prod_{s < u \le t} (I + \alpha(u)\,du)$. The matrix $A = \{A_{hj}\}$ of the integrated intensities
is defined by $A_{hj}(t) = \int_0^t \alpha_{hj}(u)\,du$.
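
As an illustration of the product-integral formula, the following sketch approximates $P(s, t)$ by multiplying the factors $I + \alpha(u)\,du$ over a fine grid. The three-state "well / ill / dead" model and its intensity values are hypothetical and chosen only for illustration; they do not come from this article.

```python
import numpy as np

def intensity_matrix(t):
    """Hypothetical illness-death intensities (0 = well, 1 = ill, 2 = dead)."""
    a01, a02, a12 = 0.10, 0.02 + 0.01 * t, 0.25
    return np.array([
        [-(a01 + a02), a01, a02],   # diagonal entry is minus the row's off-diagonal sum
        [0.0, -a12, a12],
        [0.0, 0.0, 0.0],            # absorbing state: no exits
    ])

def product_integral(alpha, s, t, steps=20000):
    """Approximate P(s, t) = prod_{s < u <= t} (I + alpha(u) du) on a midpoint grid."""
    du = (t - s) / steps
    n_states = alpha(s).shape[0]
    P = np.eye(n_states)
    for k in range(steps):
        u = s + (k + 0.5) * du
        # Factors are multiplied in increasing time order (Chapman-Kolmogorov).
        P = P @ (np.eye(n_states) + alpha(u) * du)
    return P

P = product_integral(intensity_matrix, 0.0, 5.0)
```

Each row of `P` sums to one because every factor has unit row sums, and the row for the absorbing state stays at (0, 0, 1).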

2.3. Modeling covariates 

We let the transition intensities depend on a covariate history vector process $z(t)$
through a Cox regression model $\alpha_{hj}(t \mid z(t)) = \alpha_{hj0}(t) \exp(\beta_{hj}' z(t))$, where $\alpha_{hj0}(t)$ is
an unknown baseline intensity and the regression coefficients $\beta_{hj}$ are specific to the
transition $h \to j$. It is always possible to recast this in terms of a single composite
regression vector $\beta$ with type-specific covariate vector $z_{hj}(t)$ computed from $z(t)$.
Then the model for the intensities is

(3)    $\alpha_{hj}(t \mid z(t)) = \alpha_{hj0}(t) \exp(\beta' z_{hj}(t)).$

To make explicit the dependence of $P$, $\alpha$ and $A$ on a pre-specified fixed covariate
profile $z$ we will use the notation $P(s, t \mid z)$, $\alpha(t \mid z)$, and $A(t \mid z)$, respectively.

2.4. Net present value

Consider the conditional expectation of (1), given fixed $z$ and the initial state
$X(0) = i$, $i \in E$. The expected net present value is

(4)    $E(C_{hj}^{(1)} \mid X(0) = i, z) = \int_0^\tau e^{-rt}\, c_{hj}(t \mid z)\, P_{ih}(0, t- \mid z)\, dA_{hj}(t \mid z),$

where $c_{hj}(t \mid z) = E\{C(t, h, j) \mid X(t-) = h, z\}$ and integration is on the set $(0, \tau]$.
Justification for (4) could be made as follows. Starting in state $i$ at time zero a
patient will be in state $h$ at time $t$ with probability $P_{ih}(0, t \mid z)$. Conditional on
being in state $h$ just prior to $t$, a transition to state $j$ occurs at $t$ with intensity
$\alpha_{hj}(t \mid z)$ and this transition incurs a cost whose average is $c_{hj}(t \mid z)$. We call (4) the
net present value (NPV) for all $h \to j$ transition costs in $\mathcal{T}$. An entirely analogous
argument applies to the conditional expectation of (2), given $z$ and the initial state
$X(0) = i$, which results in the NPV for all sojourn costs in state $h$ in $\mathcal{T}$. We get

(5)    $E(C_h^{(2)} \mid X(0) = i, z) = \int_0^\tau e^{-rt}\, b_h(t \mid z)\, P_{ih}(0, t- \mid z)\, dt,$

where $b_h(t \mid z) = E\{B(t, h) \mid X(t-) = h, z\}$. The interpretation of (5) is as follows.
Starting in state $i$ at time zero a patient will be in state $h$ just prior to time $t$ with
probability $P_{ih}(0, t- \mid z)$. While sojourning in state $h$ in the interval $(t, t + dt]$ an
average cost $b_h(t \mid z)\,dt$ is incurred. Then the right hand side of (5) is a weighted sum
of these costs in $[0, \tau]$.

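By averaging with respect to the initial distribution, $\pi_i(0 \mid z) = P[X(0) = i \mid z]$,
$i \in E$, we combine (4) and (5) to obtain the unconditional NPV,

    $\mathrm{NPV}(z) = \sum_{i \in E} \pi_i(0 \mid z)\, \mathrm{NPV}(z, i),$

where

(6)    $\mathrm{NPV}(z, i) = \sum_{h \ne j} \int_0^\tau e^{-rt}\, c_{hj}(t \mid z)\, P_{ih}(0, t- \mid z)\, dA_{hj}(t \mid z)$
           $+ \sum_h \int_0^\tau e^{-rt}\, b_h(t \mid z)\, P_{ih}(0, t- \mid z)\, dt.$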
Since there is no cost accumulation in absorbing states (e.g., no costs incurred 
after death) the second summation is over all transient states in E. The first sum- 
mation includes both transitions between transient states and transitions from 
a transient state to an absorbing state. We can simplify (6) further by defining
$c_h^*(t \mid z) = \sum_{j \ne h} c_{hj}(t \mid z)\, \alpha_{hj}(t \mid z)$ and rewriting the first term as

    $\sum_h \int_0^\tau e^{-rt}\, c_h^*(t \mid z)\, P_{ih}(0, t- \mid z)\, dt.$

This is similar to the second term in (6). 
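
To make (6) concrete, here is a minimal numerical sketch that evaluates NPV(z, i) for a fixed covariate profile by a Riemann sum, using the simplified form with $c_h^*$. The function name `npv_from_state`, the two-state example and all numerical values are assumptions made for illustration; in the two-state case with constant hazard the result can be checked against a closed form.

```python
import numpy as np

def npv_from_state(alpha, c, b, i, tau, r, steps=20000):
    """Riemann-sum approximation of NPV(z, i) in (6) for one covariate profile.

    alpha(t): intensity matrix; c(t): matrix of mean transition costs c_hj(t);
    b(t): vector of mean sojourn cost rates b_h(t), zero for absorbing states.
    """
    dt = tau / steps
    n_states = alpha(0.0).shape[0]
    p = np.zeros(n_states)
    p[i] = 1.0                                  # row i of P(0, 0)
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        a = alpha(t)
        off_diag = a - np.diag(np.diag(a))      # intensities alpha_hj with h != j
        c_star = (c(t) * off_diag).sum(axis=1)  # c*_h(t) = sum_j c_hj(t) alpha_hj(t)
        total += np.exp(-r * t) * float(p @ (c_star + b(t))) * dt
        p = p @ (np.eye(n_states) + a * dt)     # advance row i of P(0, t)
    return total

# Hypothetical two-state model: state 0 = alive, state 1 = dead.
haz, cost_at_death, cost_rate, r, tau = 0.2, 1000.0, 50.0, 0.03, 10.0
alpha = lambda t: np.array([[-haz, haz], [0.0, 0.0]])
c = lambda t: np.array([[0.0, cost_at_death], [0.0, 0.0]])
b = lambda t: np.array([cost_rate, 0.0])

npv = npv_from_state(alpha, c, b, i=0, tau=tau, r=r)
# With constant hazard, P_00(0, t) = exp(-haz * t), so (6) integrates in closed form.
closed_form = (cost_at_death * haz + cost_rate) * (1 - np.exp(-(r + haz) * tau)) / (r + haz)
```

The numerical value agrees with the closed form up to discretization error, which shrinks as `steps` grows.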

Equations (4)-(6) place a structure on the accumulating costs by considering
costs incurred at transitions separately from costs incurred during sojourns. In
general, we might consider two non-negative, non-decreasing, right-continuous processes
$\{V_k(t) : t \in \mathcal{T}, k = 1, 2\}$ to represent the cost accumulation which is assumed
to end at time $\tau$, or prior to $\tau$ if an absorbing state has been entered. For costs
incurred at transition times, $V_1(t) = \sum_{h \ne j} \sum_{u \le t} e^{-ru}\, C(u, h, j)\, \Delta N_{hj}(u)$, and for
costs incurred during sojourn in states we have $V_2(t) = \sum_h \int_0^t e^{-ru}\, B(u, h)\, Y_h(u)\, du$.

2.5. Censoring

Observation of $X$ will cease at time $\tau$ unless an absorbing state was entered prior
to $\tau$. Also censoring might occur at some random time $U$, which limits observation
up to $\tau \wedge U$. We assume $U$ is independent of $X$ and replace $N_{hj}(t)$ by the censored
process $N_{hj}(t \wedge U)$ and the state indicator $Y_h(t)$ by $Y_h(t) = [X(t-) = h, U \ge t]$.
Therefore, for the process $X$ the information $\mathcal{F}_t$ revealed up to time $t$ is generated
from $X(0)$, $z(0)$, and $\{z(u), Y_h(u), N_{hj}(u) : u \le t \wedge U, h \ne j, h, j \in E\}$. For costs
incurred at transition times the information known up to time $t$ is

    $\{C_{hj}(u)\, \Delta N_{hj}(u) : u \le t \wedge U,\ h \ne j,\ h, j \in E\},$

whereas for sojourns we would know at best the cumulative costs

    $\left\{ \int_0^{t \wedge U} B_h(u)\, Y_h(u)\, du,\ h \in E \right\}.$

In both cases censoring limits what we can observe. If $U$ precedes both $\tau$ and the
time of absorption then total costs are not observed. Furthermore, the observational
scheme might restrict observation of sojourn costs to only completed sojourns or
to a finite number of time points during the sojourn. We assume that censoring is
completely random in the sense that $U$ is independent of $(X(t), V(t) : t \ge 0)$.

2.6. Survival time

Label the states in $E$ so that $0, \ldots, m - 1$ are transient and $m$ is absorbing (e.g., the
state "dead"). Survival time $\tau_m = \inf\{t > 0 : X(t) = m\}$ is the time to absorption.
The survival distribution, conditional on $X(0) = i$, is

    $S_{mi}(t \mid z) = P[\tau_m > t \mid X(0) = i, z] = 1 - P_{im}(0, t \mid z),$

and the unconditional survival distribution is

    $S_m(t \mid z) = 1 - \sum_i \pi_i(0 \mid z)\, P_{im}(0, t \mid z).$

In the special case of one transient state 0 ("alive") and one terminal state 1
("dead") we get the usual survival time $T (= \tau_1)$ and its survival distribution
$S(t \mid z) = P[T > t \mid z] = P_{00}(0, t \mid z)$.

3. Estimation 

Suppose we observe the aforementioned processes for each of $n$ subjects in a longitudinal
study. For the $i$-th patient the basic covariate vector is $z_i(t)$, the initial
state $X_i(0)$, the state indicator $Y_{hi}(t) = [X_i(t-) = h, U_i \ge t]$ and the number of
direct $h \to j$ transitions

    $N_{hji}(t) = \#\{u \le t \wedge U_i : X_i(u-) = h, X_i(u) = j\}, \quad h \ne j.$

Conditionally on $\{z_i(0), X_i(0) : 1 \le i \le n\}$ assume the processes $\{X_i(t) : t \in \mathcal{T}\}$ are
independent and that model (3) holds for each individual with the same baseline
intensities. From now on denote by $N_{hj}(t)$ and $Y_h(t)$, respectively, the aggregated
processes $\sum_{i=1}^n N_{hji}(t)$ and $\sum_{i=1}^n Y_{hi}(t)$. In this context estimation of the transition
probabilities $P_{hj}(0, t \mid z)$ and integrated intensities $A_{hj}(t \mid z)$ at a fixed covariate
profile $z$ is well known (Andersen et al. [1]). Combining this with appropriate estimation
of costs would lead to estimators of NPV. However, before we describe an
approach to estimation we first consider several examples.

3.1. Single transition without covariates 

The only permissible transition $0 \to 1$ is associated with a single cost $C(T, 0, 1)$
(denoted here by $y$) where $T$ denotes the survival time. From (6) we have

    $\mathrm{NPV} = \int_0^\tau e^{-rt}\, c_{01}(t)\, P_{00}(0, t-)\, dA_{01}(t).$

To estimate NPV we use the estimators

    $\hat{P}_{00}(0, t-) = \hat{S}(t-)$

and

    $d\hat{A}_{01}(t) = \{Y_0(t)\}^{-1}\, dN_{01}(t),$

where $\hat{S}$ is the Kaplan-Meier estimator of the survival distribution of the survival
time $T (= \tau_1)$, $Y_0(t) = \sum_{i=1}^n [T_i \wedge U_i \ge t]$ and $N_{01}(t) = \sum_{i=1}^n [T_i \le t \wedge U_i]$. If there
are no ties in the survival times $T_i$, the natural estimator of $c_{01}$ is $\hat{c}_{01}(T_i) = y_i$.
Therefore our estimator of NPV is

(7)    $\widehat{\mathrm{NPV}} = \int_0^\tau e^{-rt}\, \hat{c}_{01}(t)\, \frac{\hat{S}(t-)}{Y_0(t)}\, dN_{01}(t) = \sum_{i=1}^n e^{-rT_i}\, y_i\, \frac{\hat{S}(T_i-)}{Y_0(T_i)}\, [T_i \le U_i \wedge \tau].$

If $\hat{G}$ denotes the Kaplan-Meier estimator of the survival distribution of $U_i$, and
using the fact that $\hat{S}(t-)\hat{G}(t-) = n^{-1} Y_0(t)$ if there are no ties between survival
and censoring times, then (7) can be rewritten as

(8)    $\widehat{\mathrm{NPV}} = n^{-1} \int_0^\tau e^{-rt}\, \frac{\hat{c}_{01}(t)}{\hat{G}(t-)}\, dN_{01}(t) = n^{-1} \sum_{i=1}^n e^{-rT_i}\, y_i\, [T_i \le U_i \wedge \tau] / \hat{G}(T_i-).$

Using the consistency of the Kaplan-Meier estimator $\hat{G}$ we see that $\widehat{\mathrm{NPV}}$ converges
to $E(e^{-rT} y\, [T \le \tau])$ provided $G(\tau-) > 0$. Therefore in the absence of discounting
$\widehat{\mathrm{NPV}}$ estimates the average cost restricted to $\tau$. In this case (8) with $r = 0$ is the
mean cost estimator described by Bang and Tsiatis [2] and Zhao and Tian [21].
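
The inverse-probability-of-censoring form (8) is short enough to sketch directly. The code below is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the inputs are the follow-up time `x = min(T, U)`, the death indicator `delta = [T <= U]` and the cost `y` (used only when the death is observed), and there are no ties between survival and censoring times.

```python
import numpy as np

def km_censoring_left_limit(x, delta, t):
    """Kaplan-Meier estimate of G(t-) = P[U >= t], treating censoring (delta == 0) as the event."""
    g = 1.0
    for s in np.unique(x[(delta == 0) & (x < t)]):
        at_risk = np.sum(x >= s)
        censored_here = np.sum((x == s) & (delta == 0))
        g *= 1.0 - censored_here / at_risk
    return g

def npv_ipcw(x, delta, y, tau, r=0.0):
    """Estimator (8): average of discounted observed costs weighted by 1 / G(T_i-)."""
    x, delta, y = (np.asarray(v, dtype=float) for v in (x, delta, y))
    total = 0.0
    for xi, di, yi in zip(x, delta, y):
        if di == 1 and xi <= tau:            # indicator [T_i <= U_i ^ tau]
            total += np.exp(-r * xi) * yi / km_censoring_left_limit(x, delta, xi)
    return total / len(x)

# Small hypothetical sample: the second subject is censored at time 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
delta = np.array([1, 0, 1, 1])
y = np.array([10.0, 0.0, 30.0, 40.0])
estimate = npv_ipcw(x, delta, y, tau=5.0, r=0.0)
```

In this sample the censoring survival estimate is 1 before time 2 and 2/3 afterwards, so the estimate is (10 + 30/(2/3) + 40/(2/3))/4 = 28.75, the same value that form (7) gives.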

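If there are ties in the survival times and $0 < t_1^* < \cdots < t_k^* \le \tau$ are the distinct
observed times, then $\hat{c}_{01}(t_j^*) = \bar{y}_j$ is the mean of the observed costs at time $t_j^*$ and
the right-hand side of (8) is

    $n^{-1} \sum_{j : t_j^* \le \tau} e^{-r t_j^*}\, \bar{y}_j\, d_j / \hat{G}(t_j^*-),$

where $d_j$ is the multiplicity of $t_j^*$.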



3.2. Single sojourn without covariates 



A single sojourn begins in state 0 and ends with transition to state 1 at time $T$.
Sojourn cost is incurred through time $T^* = \min(T, \tau)$. From (6) the NPV of interest
is

(9)    $\mathrm{NPV} = \int_0^\tau e^{-rt}\, S(t-)\, b_0(t)\, dt = \int_0^\tau S(t-)\, dm(t)$

where $m(t) = \int_0^t e^{-ru}\, b_0(u)\, du$. Allowing for an initial cost at $t = 0$, integration by
parts yields

(10)    $\mathrm{NPV} + m(0) = E(m(T^*)) = \int_0^\tau m(t)\, (-dS(t)) + m(\tau) S(\tau)$

where $m(0)$ is the expected initial cost. In the absence of discounting ($r = 0$) and
ignoring covariates, Strawderman [18] considers the nonparametric estimation of
NPV based on observations on (censored) survival times and accumulating costs
in $[0, \tau]$. For the $i$-th subject the observed data are $(N_i(t), Y_i(t), V_i(t) : t \le \tau)$,
where $V_i(t)$ is the accumulated cost up to time $t$, $N_i(t) = [T_i \le t, T_i \le U_i]$, and
$Y_i(t) = [T_i \wedge U_i \ge t]$.

Define $R(t, u) = E(V_i(t) \mid T_i \ge u)$ for $t \ge u$ and estimate

    $m(t) = \int_0^t E\{R(du, u) \mid T \ge u\}$

by

    $\hat{m}(t) = \sum_{i=1}^n \int_0^t \{Y_0(u)\}^{-1}\, Y_i(u)\, dV_i(u).$

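This leads to the estimator of NPV,

(11)    $\widehat{\mathrm{NPV}} = \sum_{i=1}^n \int_0^\tau \hat{S}(t-)\, \{Y_0(t)\}^{-1}\, Y_i(t)\, dV_i(t)$

where, as before, $\hat{S}$ is the Kaplan-Meier estimator of $S$ and $Y_0(t) = \sum_{i=1}^n Y_i(t)$.
To include in (11) a cost at $t = 0$ we could add the term $n^{-1} \sum_{i=1}^n V_i(0)$ as the
estimator of $m(0)$. Because

    $-d\hat{S}(t) = \hat{S}(t-)\, \{Y_0(t)\}^{-1}\, dN(t), \quad N(t) = \sum_{i=1}^n N_i(t),$

the right hand side of (10) would be estimated by

    $\sum_{i=1}^n \hat{m}(T_i)\, \hat{S}(T_i-)\, \{Y_0(T_i)\}^{-1}\, [T_i \le U_i \wedge \tau] + \hat{m}(\tau)\, \hat{S}(\tau).$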

Expression (11) is useful when the accumulating cost history is observed. 

3.3. Single sojourn without covariates with restricted cost history 

Suppose the cost accumulation process $V_i(t)$ is observed at fixed time points $\{a_0,
\ldots, a_G\}$ where $0 = a_0 < a_1 < \cdots < a_G = \tau$. Let $V_{ig} = V_i(a_g) - V_i(a_{g-1})$. If
observation goes past $a_g$ then $V_{ig}$ is observed. If $T_i \in (a_{g-1}, a_g]$ then $V_i(a_g) = V_i(T_i)$
and if $T_i \le a_{g-1}$, $V_{ig} = 0$. When censoring occurs in $(a_{g-1}, a_g]$ the true incremental
cost in the interval is not known. We only observe $\tilde{V}_{ig} = V_i(U_i) - V_i(a_{g-1})$. In all
other cases we define $\tilde{V}_{ig} = V_{ig}$. Regarding $dV_i(t)$ in (11) as a discrete measure with
mass $\tilde{V}_{ig}$ at $t = a_{g-1}$ we obtain

(12)    $\widehat{\mathrm{NPV}} = \sum_{g=1}^G \hat{S}(a_{g-1}-)\, Y_0^{-1}(a_{g-1}) \sum_{i=1}^n Y_i(a_{g-1})\, \tilde{V}_{ig}.$

This estimator was introduced by Lin et al. [15]. By the weak law of large numbers
and the independence of $U_i$ with $T_i$ and $V_i(t)$,

    $Y_0^{-1}(a_{g-1}) \sum_{i=1}^n Y_i(a_{g-1})\, V_{ig} \to E(Y(a_{g-1}) V_{ig}) / E(Y(a_{g-1})) = E(V_{ig} \mid T_i \ge a_{g-1}).$

Because $\tilde{V}_{ig}$ differs from $V_{ig}$ when there is censoring, (12) converges to

    $\sum_{g=1}^G S(a_{g-1}-)\, E(V_{ig} \mid T_i \ge a_{g-1}) - E^* = E(V_i(\tau)) - E^*$

where $E^* = E\{(V_i(a_g) - V_i(U_i))\, [U_i < T_i \wedge a_g] \mid U_i \ge a_{g-1}\}$. Hence there is
downward bias in estimating the mean cost $E(V_i(\tau))$. If censoring does occur close
to the right endpoint of the intervals this bias is likely to be small.
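
A minimal sketch of the partition estimator (12) follows, without discounting. The function names, the array layout (`v_tilde[i, g-1]` holds the observed incremental cost of subject i in $(a_{g-1}, a_g]$) and the toy data are assumptions for illustration only.

```python
import numpy as np

def km_survival_left_limit(x, delta, t):
    """Kaplan-Meier estimate of S(t-) from follow-up times x and death indicators delta."""
    s_hat = 1.0
    for u in np.unique(x[(delta == 1) & (x < t)]):
        s_hat *= 1.0 - np.sum((x == u) & (delta == 1)) / np.sum(x >= u)
    return s_hat

def npv_partition(x, delta, v_tilde, a):
    """Estimator (12): sum over intervals of S(a_{g-1}-) times the mean observed
    incremental cost among subjects still at risk at a_{g-1}."""
    x, delta, v_tilde, a = (np.asarray(v, dtype=float) for v in (x, delta, v_tilde, a))
    total = 0.0
    for g in range(1, len(a)):
        at_risk = x >= a[g - 1]                  # Y_i(a_{g-1})
        if not at_risk.any():
            continue                             # nobody left to contribute
        total += km_survival_left_limit(x, delta, a[g - 1]) * v_tilde[at_risk, g - 1].mean()
    return total

# Hypothetical uncensored sample on the grid 0 < 1 < 2 < 3 = tau.
x = np.array([0.5, 1.5, 2.5])
delta = np.array([1, 1, 1])
v_tilde = np.array([[5.0, 0.0, 0.0],
                    [4.0, 6.0, 0.0],
                    [3.0, 3.0, 3.0]])
estimate = npv_partition(x, delta, v_tilde, a=[0.0, 1.0, 2.0, 3.0])
```

Without censoring the estimator reduces to the sample mean of total cost, here (5 + 10 + 9)/3 = 8, which is a convenient sanity check.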

3.4. Regression model-based estimates of NPV

For the $i$-th subject let $Y_i = (y_{i1}, \ldots, y_{in_i})'$ denote the costs for the sojourns ending
at chronologically ordered times $t_i = (t_{i1}, \ldots, t_{in_i})'$. If the last sojourn has not ended
with transition to an absorbing state, we will define $t_{in_i} = \tau$ so that $y_{in_i}$ is the cost
associated with the period $[t_{i,n_i-1}, \tau]$. Censoring would also preclude observation of
some sojourn costs. Observation ends in one of three ways: (1) censoring occurs at
$U_i$ before $\tau$, (2) an absorbing state is reached before $\tau$, or (3) observation goes past
$\tau$. The cost $y_{ig}$ associated with the $g$-th sojourn interval $(t_{i,g-1}, t_{ig}]$ is observed if

    $s_{ig} = 1$ where $s_{ig} = [U_i \ge t_{ig} \wedge \tau]$.

Let $S_i$ denote the diagonal matrix of the $\{s_{ig}, g = 1, \ldots, n_i\}$ and $X_i = (x_{i1}, \ldots,
x_{in_i})'$ be an $n_i \times p$ matrix of covariates associated with $Y_i$. The components of $x_{ig}$
contain covariates that are fixed over time as well as covariates that vary with time,
but only through $(t_{i1}, \ldots, t_{ig})$. In particular $x_{ig}$ will contain functions of $t_{i,g-1}$, $t_{ig}$.
The conditional mean vector and covariance matrix are denoted, respectively, by
$\mu_i = E(Y_i \mid X_i)$, $V_i = E[(Y_i - \mu_i)(Y_i - \mu_i)' \mid X_i]$. We impose strict exogeneity on
the conditional means $\mu_{ig} = E(y_{ig} \mid X_i)$ that requires $\mu_{ig}$ to be a function of $x_{ig}$
only, that is, $E(y_{ig} \mid x_{i1}, \ldots, x_{in_i}) = E(y_{ig} \mid x_{ig})$ for all $g = 1, \ldots, n_i$. Independence
across subjects is assumed, in fact that $\{(Y_i, X_i, S_i) : 1 \le i \le n\}$ is a random
sample. The total number of records in the sample is $N = \sum_{i=1}^n n_i$.

Let $h$ be a link function such that $h(\mu_{ig}) = x_{ig}' \beta$ where $\beta$ is a $p \times 1$ vector of
unknown parameters. (The $\beta$ here is not the same as the regression parameter in the
intensity model (3) of Section 2.3.) The $n_i \times p$ matrix $D_i$ of derivatives $\partial \mu_i / \partial \beta'$ can be
expressed as $D_i = D_i^0 X_i$ where $D_i^0$ is the diagonal matrix with elements $(dh/dx)^{-1}$
evaluated at $x = \mu_{ig}$. Assuming $V_i$ is positive definite we may write $V_i = L_i L_i'$
where $L_i$ is the unique lower triangular matrix with positive diagonal elements.
Make the transformations $\tilde{Y}_i = w_i^{1/2} L_i^{-1} Y_i$, $\tilde{\mu}_i = w_i^{1/2} L_i^{-1} \mu_i$ where $w_i$ is the
diagonal matrix with elements $w_{ig} = s_{ig} / p(t_{ig} \wedge \tau-, z_i)$ and $p(t, z_i) = P[U_i > t \mid z_i]$.
Here $z_i$ are fixed covariates that model the censoring distribution. They may differ
from the components of $X_i$. Given $z_i$, assume $U_i$ is independent of $(Y_i, X_i, t_i)$.
Then $E(s_{ig} \mid Y_i, X_i, t_i, z_i) = P[U_i \ge t_{ig} \wedge \tau \mid z_i]$ and $E(w_{ig} \mid Y_i, X_i, t_i, z_i) = 1$ under
the assumption $p(\tau-, z_i) > 0$.

An estimator of $\beta$ is obtained by minimizing the sum of squares $\sum_{i=1}^n q(Y_i, w_i, X_i)$,
where $q(Y_i, w_i, X_i) = (\tilde{Y}_i - \tilde{\mu}_i)'(\tilde{Y}_i - \tilde{\mu}_i)$, with respect to $\beta$, which leads to the
estimating equation

(13)    $\sum_{i=1}^n D_i' (L_i^{-1})'\, w_i\, (L_i^{-1}) (Y_i - \mu_i) = 0.$

Because

    $E[D_i' (L_i^{-1})'\, w_i\, (L_i^{-1}) (Y_i - \mu_i)]$
        $= E[D_i' (L_i^{-1})'\, E(w_i \mid Y_i, X_i, t_i, z_i)\, (L_i^{-1}) (Y_i - \mu_i)]$
        $= E[D_i' V_i^{-1} (Y_i - \mu_i)] = 0,$

(13) provides a consistent estimator $\hat{\beta}$ of $\beta$. The transformation of $Y_i - \mu_i$ and $D_i$
by $w_i^{1/2} L_i^{-1}$ preserves time order and effectively uses only uncensored data in (13).
In the absence of censoring we would use the estimating equation

    $\sum_{i=1}^n D_i' V_i^{-1} (Y_i - \mu_i) = 0.$

Hence (13) is the generalized estimating equations (GEE) analog for the selected
sample $\{(Y_i, X_i, S_i) : 1 \le i \le n\}$.

Following the standard GEE methodology, $n^{1/2}(\hat{\beta} - \beta)$ is asymptotically normal
with zero mean and covariance matrix $A^{-1} B A^{-1}$ where

(14)    $A = E\left[ \frac{\partial S_i(w_i, Y_i, X_i, \beta)}{\partial \beta'} \right], \quad B = E[S_i(w_i, Y_i, X_i, \beta)\, S_i'(w_i, Y_i, X_i, \beta)]$

and $S_i(w_i, Y_i, X_i, \beta) = D_i' (L_i^{-1})'\, w_i\, (L_i^{-1}) (Y_i - \mu_i)$. Consistent estimators of $A$
and $B$ are obtained by replacing the expectations in (14) by their sample averages
and $\beta$ by $\hat{\beta}$. In addition, we also need a consistent estimator of $V_i = L_i L_i'$ and
the censoring distribution $p(t, z_i)$. Methods for their estimation are suggested in
specific contexts in Lin [13, 14], Baser et al. [5] and Gardiner et al. [8].

Another approach is to estimate a random-effects (RE) model for $Y_i$ (or a transformation
of $Y_i$) given by

(15)    $Y_i = X_i \beta + a_i 1_i + u_i$

where $\beta$ is an unknown $p \times 1$ parameter, $1_i$ the $n_i \times 1$ vector with all elements equal
to 1, $a_i$ an unobserved patient-specific heterogeneity and $u_i$ is the $n_i \times 1$ vector of
idiosyncratic errors. The composite error is $v_i = a_i 1_i + u_i$. Assume $\Omega_i = E(v_i v_i')$
is positive definite and that the standard RE assumptions (Wooldridge [20]) hold:

(a) $E(u_i \mid X_i, a_i) = 0$, $E(a_i \mid X_i) = 0$,

(b) rank $E(X_i' \Omega_i^{-1} X_i) = p$,

(c) $E(u_i u_i' \mid X_i, a_i) = \sigma_u^2 I_i$, $E(a_i^2 \mid X_i) = \sigma_a^2$ where $\sigma_u^2$ and $\sigma_a^2$ are constants and $I_i$
is the $n_i \times n_i$ identity matrix. Therefore $E(Y_i \mid X_i) = X_i \beta$ and $\Omega_i = \sigma_u^2 I_i + \sigma_a^2 J_i$ where $J_i$
is the $n_i \times n_i$ matrix with all elements equal to 1.

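To estimate $\beta$ in (15) from censored observations on costs we first transform
$(Y_i, X_i, v_i)$ to $(\tilde{Y}_i, \tilde{X}_i, \tilde{v}_i)$ where $\tilde{v}_i = w_i^{1/2} L_i^{-1} v_i$ and $\tilde{Y}_i$, $\tilde{X}_i$ are similarly defined.
Here $L_i$ is the unique lower triangular matrix with positive diagonal elements such
that $\Omega_i = L_i L_i'$. The objective function for estimating $\beta$ is

    $q(Y_i, w_i, X_i) = \{w_i^{1/2} L_i^{-1} (Y_i - X_i \beta)\}' \{w_i^{1/2} L_i^{-1} (Y_i - X_i \beta)\}.$

Specializing (13) leads to the generalized least-squares (GLS) weighted estimator
$\hat{\beta}_w$ given by

(16)    $\hat{\beta}_w = \left( \sum_{i=1}^n \tilde{X}_i' \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^n \tilde{X}_i' \tilde{Y}_i \right).$

From (16) we get the consistency of $\hat{\beta}_w$ and

(17)    $n^{1/2} (\hat{\beta}_w - \beta) \to N(0, A^{-1} B A^{-1})$

where $A = E(X_i' \Omega_i^{-1} X_i)$ and $B = E(\tilde{X}_i' \tilde{v}_i \tilde{v}_i' \tilde{X}_i)$.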
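
The weighted GLS estimator (16) can be sketched as follows. This is an illustration only: the function name `weighted_gls` is hypothetical, and the variance components and the inverse-probability weights are taken as known inputs, whereas in practice both must be estimated as discussed above.

```python
import numpy as np

def weighted_gls(Y_list, X_list, w_list, sigma_u2, sigma_a2):
    """Weighted GLS of (16): regress Y~ = w^{1/2} L^{-1} Y on X~ = w^{1/2} L^{-1} X,
    where Omega_i = sigma_u2 * I + sigma_a2 * J = L L' (Cholesky factorization)."""
    p = X_list[0].shape[1]
    xtx = np.zeros((p, p))
    xty = np.zeros(p)
    for Y, X, w in zip(Y_list, X_list, w_list):
        n_i = len(Y)
        omega = sigma_u2 * np.eye(n_i) + sigma_a2 * np.ones((n_i, n_i))
        L = np.linalg.cholesky(omega)            # lower triangular, positive diagonal
        root_w = np.sqrt(np.asarray(w, dtype=float))
        X_t = root_w[:, None] * np.linalg.solve(L, X)
        Y_t = root_w * np.linalg.solve(L, Y)
        xtx += X_t.T @ X_t
        xty += X_t.T @ Y_t
    return np.linalg.solve(xtx, xty)

# Hypothetical noise-free data: the estimator should recover beta exactly,
# whatever the (nonnegative) weights, as long as the design stays full rank.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0])
X_list = [rng.normal(size=(3, 2)) for _ in range(30)]
Y_list = [X @ beta_true for X in X_list]
w_list = [rng.choice([0.0, 1.25, 2.0], size=3) for _ in range(30)]
beta_w = weighted_gls(Y_list, X_list, w_list, sigma_u2=1.0, sigma_a2=0.5)
```

A weight of zero drops a censored record from the fit, while weights above one compensate for comparable records lost to censoring.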



3.5. Estimation of NPV 



From our model (15) for all transition costs we obtain estimates of $c_{hj}(t \mid z)$ for a
covariate profile $z$ by specifying the covariates $x_0$ corresponding to column positions
in $X$. The row vector of $X_i$ in our model for $y_{ij}$ will contain the fixed covariates
$x_i$, dummies for transition types, terms for modeling the transition times such as
$t_{ij}$, $t_{ij}^2$ and perhaps interactions between these times and $x_i$. Our special $x_0$ will
contain the desired $z$, interactions between $z$, $t$ and $t^2$, indicator variables with value
1 for transition type $h \to j$, and value 0 for all other transition types. Denoting
this covariate profile by $x_{hj0}(t)$, then $c_{hj}(t \mid z) = x_{hj0}'(t) \beta$ and from (16) we obtain
the estimator

(18)    $\hat{c}_{hj}(t \mid z) = x_{hj0}'(t)\, \hat{\beta}_w.$

Although the consistency of $\hat{c}_{hj}(t \mid z)$ might seem immediate from (18), the final
form of the computable $\hat{\beta}_w$ involves the estimated $\Omega_i$ and weights $w_i$, the latter
through the censoring distribution $G$. A formal verification is not attempted here,
but see Baser et al. [5] for a similar context.

Now recall the expected net present value $E(C_{hj}^{(1)} \mid X(0) = i, z)$. Plugging in
estimators for the entities on the right hand side in (4) leads to

(19)    $\hat{E}(C_{hj}^{(1)} \mid X(0) = i, z) = \int_0^\tau e^{-rt}\, \hat{c}_{hj}(t \mid z)\, \hat{P}_{ih}(0, t- \mid z)\, d\hat{A}_{hj}(t \mid z).$

The estimation of $E(C_h^{(2)} \mid X(0) = i, z)$ is entirely analogous except that one must
deal with the quantity $b_h(t \mid z)$ which is the expected mean rate of expenditures at
time $t$ while sojourning in state $h$. In practice it will not be observable unless discrete
information is available. Instead, we will know only the total cost of the sojourn.
For example, consider hospital costs for patients undergoing coronary artery bypass
surgery. Expenditures are incurred in various care units such as the intensive care
unit, cardiac care unit and in recovery. We would know the entry and exit dates
for each unit and the associated cost of the length of stay in each unit, but not
necessarily the cost per day. An application modeling treatment cost rates in cancer
patients is discussed in Gardiner et al. [8] using a model for the log-transformed rate
of cost accumulation $y_{ij} = y_{ij}^* / (t_{ij} - t_{i,j-1})$ between consecutive transition times
$t_{i1}, t_{i2}, \ldots$ where $y_{ij}^*$ is the sojourn cost in $[t_{i,j-1}, t_{ij})$.



3.6. Single transition with covariates 

Consider the same scenario discussed previously in 3.1 with all patients starting
in state "0" and followed until they reach the terminal state "1" (dead). For the
$i$-th patient $T_i$ is the survival time and $U_i$ the censoring time. Observation ceases
at $\min(T_i, U_i, \tau)$, that is, either at the failure time, or censoring time or the limit
of observation. The only cost incurred is $y_i = y_i(T_i)$ at time $T_i$ which is observed
if $s_i = 1$ where $s_i = [U_i \wedge \tau \ge T_i]$. Let $x_i$ denote a $p$-vector of fixed covariates of
interest and $z_i$ denote fixed covariates used for modeling the censoring distribution.
Assuming independent censoring, that is, given $z_i$, $U_i$ is independent of $(y_i, x_i, T_i)$,
we get

    $P[s_i = 1 \mid z_i, y_i, x_i, T_i] = P[U_i \ge T_i, T_i \le \tau \mid z_i, T_i] = G(T_i- \mid z_i)\, [T_i \le \tau].$

Defining $w_i = s_i / G(T_i- \mid z_i)$ we see that (13) reduces to minimizing with respect to $\beta$
the objective function $n^{-1} \sum_{i=1}^n q(y_i, w_i, x_i)$ where $q(y_i, w_i, x_i) = \sigma^{-2} w_i (y_i - x_i' \beta)^2$.
This yields the estimator $\hat{\beta}_w$ in (16) which in this case is

    $\hat{\beta}_w = \left( \sum_{i=1}^n w_i x_i x_i' \right)^{-1} \sum_{i=1}^n w_i x_i y_i.$

This is the same estimator described by Lin [13] except for a slight difference in the
weights. Because Lin [13] uses a model in which costs are incurred through time
$T_i \wedge \tau$, his censoring indicator is $s_i^* = [U_i \ge T_i \wedge \tau]$ and weight $w_i^* = s_i^* / G(T_i \wedge \tau- \mid z_i)$.
In our case the cost is realized at time $T_i$ if and only if $s_i = 1$.

Let $z_0$ denote a fixed covariate profile at which NPV($z_0$) is to be estimated. Since "0"
is the initial state and the only transition is $0 \to 1$, $\hat{P}_{00}(0, t- \mid z_0) = \hat{S}(t- \mid z_0)$ and
$\hat{S}(t \mid z_0) = \exp(-\hat{A}_{01}(t \mid z_0))$. Here $\hat{S}$ is the estimator of the survival distribution $S$
of $T$, the time of transition. From (18) and (19) our estimator of NPV($z_0$) is

    $\widehat{\mathrm{NPV}}(z_0) = \hat{\beta}_w' \int_0^\tau e^{-rt}\, x_0(t)\, \{-d\hat{S}(t \mid z_0)\},$

where $x_0(t)$ is the covariate vector derived from $z_0$ and terms used to model time
(such as $t$, $t^2$) in the cost equation $y_i = x_i' \beta + u_i$. Here a single cost $y_i = y_i(T_i)$ is
incurred at $T_i$, if observed by time $\tau$. Then

    $E[y_i(t) \mid z_0, T_i = t] = x_0'(t) \beta$

and

    $\mathrm{NPV}(z_0) = \int_0^\tau e^{-rt}\, E[y_i(t) \mid z_0, T_i = t]\, \{-dS(t \mid z_0)\}$

simplifies to $E(e^{-rT_i} y_i(T_i)\, [T_i \le \tau] \mid z_0)$.

Since $\hat{\beta}_w \to \beta$ in probability and, uniformly on $[0, \tau]$, $\hat{S}(\cdot \mid z_0) \to S(\cdot \mid z_0)$
in probability if $S(\tau \mid z_0) > 0$, we obtain the consistency of $\widehat{\mathrm{NPV}}(z_0)$ provided
$\int_0^\tau e^{-rt}\, x_0(t)\, dS(t \mid z_0)$ is finite. Also in estimating $\beta$ we require

    $P[T_i \le \tau \mid z_0] = 1 - S(\tau \mid z_0) > 0,$

because otherwise the cost equation will be vacuous since with probability 1 no
transition takes place in $[0, \tau]$.

3.7. Single sojourn with covariates 

Suppose the interval $[0, \tau]$ is partitioned by the fixed points $a_j$, $j = 0, \ldots, K$ with
$0 = a_0 < a_1 < \cdots < a_K = \tau$. If the expected rate of cost accumulation is constant
in the intervals $(a_{j-1}, a_j]$ with values $b_j$ we have

    $\mathrm{NPV}(z_0) = \sum_{j=1}^K b_j \int_{a_{j-1}}^{a_j} e^{-rt}\, S(t \mid z_0)\, dt.$

The integral is the increment over $(a_{j-1}, a_j]$ in discounted life expectancy

    $\mathrm{LE}(z_0, t) = \int_0^t e^{-ru}\, S(u \mid z_0)\, du.$

Following Baser et al. [5] we could use a RE model for cost $Y_i = (y_{i1}, \ldots, y_{iK})'$
incurred by the $i$-th patient. Here $y_{ij}$ is the cost incurred in interval $(a_{j-1}, a_j]$ which
is observed provided $s_{ij} = 1$ where $s_{ij} = [T_i \wedge U_i \ge a_j] + [a_{j-1} < T_i \le U_i \wedge a_j]$. This
reflects the two cases: (1) the patient neither died nor was censored before $a_j$, or
(2) death was observed in $(a_{j-1}, a_j]$. Under the assumed independence of censoring
$P[s_{ij} = 1 \mid z_i, T_i] = G(T_{ij}^*- \mid z_i)\, [T_i > a_{j-1}]$ where $T_{ij}^* = \min(T_i, a_j)$. The regression
model for $y_{ij}$ will include interval-specific elapsed time $T_{ij}^* - a_{j-1}$ which will yield
an estimator of $b_j$ that may depend on $z_0$.

4. Discussion and summary

The estimation of medical costs has received considerable attention because of its 
importance in assessing cost-effectiveness of medical interventions and treatments. 
Facing constrained healthcare budgets government planners and policy makers are 
forced to consider the costs of competing interventions in addition to claims of their 
clinical efficacy. The difference in the expected cost of two competing interventions is 
the numerator of the cost-effectiveness ratio, the denominator being the incremental 
health benefit as measured by life expectancy or by quality-adjusted life years (Chen
and Sen [6, 7] and Gardiner et al. [9]). The cost-effectiveness ratio can be used to
compare competing interventions with respect to both their health benefits and
their cost.

Obtaining reliable and valid estimates of costs is imperative. In this article we 
adopted a longitudinal framework in which patient costs are manifested dynami- 
cally over time. An underlying finite-state stochastic process describes the evolving 
patient history with costs incurred at transition times between states and during 
sojourn in states. In this framework we showed how net present values are defined, 
following the basic notions of actuarial values used extensively in the insurance and 
finance literature (Norberg [16]). For example, in the classic disability model there 
are "able" periods and "disabled" periods. The individual holding a disability in- 
surance policy would receive a fixed payment stream over the period of his or her 
disability. In able periods the individual would pay the fixed premium in accordance 
with the policy. There are three policy states: "able", "disabled", and "dead". A
fundamental difference in our context is that costs are not fixed but random. One 
may regard the total cost over a specified period as the sum of all transition costs 
and sojourn costs. 

Several methods have been proposed to estimate medical cost from follow up 
data. The primary focus has been on a single cost measure that might be incom- 
pletely ascertained due to time censoring (Bang and Tsiatis [2], Baser et al. [4], Lin 
et al. [15], Strawderman [18] and O'Hagan and Stevens [17]). Regression analyses 
allow for assessing the influence of explanatory variables on some measure of the 
cost distribution, such as the mean or median (Bang and Tsiatis [3], Baser et al. 
[5], Lin [13, 14] and Gardiner et al. [8]). Apart from addressing the incomplete- 
ness of cost data, the ability to observe costs over finer time periods can serve to 
strengthen ensuing analyses. For instance, consider the cost of a treatment which 
is assumed to last at most for one year and costs are monitored monthly. If either
the endpoint is reached before the end of the year, or observation lasts one year, 
the total cost is observed. If there is censoring of the endpoint before the end of the 
year, we could use the monthly costs, except for the last month of observation, to 
improve our estimate of the average cost of treatment. 

The methods discussed here for analyses of medical costs may be adapted to
estimate other summary measures used in cost-effectiveness analyses (Gardiner
et al. [9]). For example, quality-adjusted survival is defined by using a quality
weight $q(h, t)$ which represents the utility, relative to the state of perfect health,
of each unit of time spent in state $h = X(t)$ at time $t$. Perfect health has a
quality weight 1, while death or states judged equivalent to death get a quality
weight of 0. The total quality-adjusted time in $[0, \tau]$ is $\sum_{h \in E} \int_0^\tau e^{-rt}\, q(h, t)\, Y_h(t)\, dt$.
Hence conditional on $X(0) = i$ we define the expected quality-adjusted life years,
$\mathrm{QALY}_i(z) = \sum_{h \in E} \int_0^\tau e^{-rt}\, q(h, t)\, P_{ih}(0, t- \mid z)\, dt$. The unconditional version is given
by $\mathrm{QALY}(z) = \sum_{i \in E} \pi_i(0 \mid z)\, \mathrm{QALY}_i(z)$. This expression is similar to the second
term of the NPV in (6).

The transition model adopted in this article extends the simpler two-state survival
model with a single transition and sojourn. The underlying analysis of survival
times is now replaced by the analysis of multiple event times which is facilitated by 
using a non-homogeneous Markov model to govern the movement between states, 
and a multiplicative intensity model to incorporate covariate effects. For the analysis
of longitudinal cost data, techniques such as inverse probability weighting to
account for censoring can be applied (Willan et al. [19]) but a more careful con- 
sideration is required to combine the two parts of the model, the transition model 
for the event times and a regression model for costs. Methods for joint modeling 
of longitudinal observations and event times could be adapted for this purpose 
(Henderson et al. [10] and Hogan and Laird [11, 12]). 

Acknowledgments. We thank the referees for their careful reading of the manuscript.